## [1] 263 4
## id name rgb is_trans
## Min. : -1.0 Length:263 Length:263 Length:263
## 1st Qu.: 83.0 Class :character Class :character Class :character
## Median :1005.0 Mode :character Mode :character Mode :character
## Mean : 651.4
## 3rd Qu.:1070.5
## Max. :9999.0
## [1] 60456 4
## element_id part_num color_id design_id
## Min. : 9327 Length:60456 Min. : -1.0 Min. : 1001
## 1st Qu.:4565425 Class :character 1st Qu.: 10.0 1st Qu.: 18454
## Median :6111350 Mode :character Median : 28.0 Median : 41748
## Mean :5517587 Mean : 120.4 Mean : 45570
## 3rd Qu.:6286413 3rd Qu.: 85.0 3rd Qu.: 75474
## Max. :6499141 Max. :9999.0 Max. :107520
## [1] 37265 3
## id version set_num
## Min. : 1 Min. : 1.000 Length:37265
## 1st Qu.: 14424 1st Qu.: 1.000 Class :character
## Median : 54379 Median : 1.000 Mode :character
## Mean : 61104 Mean : 1.091
## 3rd Qu.: 88842 3rd Qu.: 1.000
## Max. :194312 Max. :16.000
## [1] 20858 3
## inventory_id fig_num quantity
## Min. : 3 Length:20858 Min. : 1.000
## 1st Qu.: 7869 Class :character 1st Qu.: 1.000
## Median : 15681 Mode :character Median : 1.000
## Mean : 43010 Mean : 1.062
## 3rd Qu.: 66834 3rd Qu.: 1.000
## Max. :194312 Max. :100.000
## [1] 1180987 6
## inventory_id part_num color_id quantity
## Min. : 1 Length:1180987 Min. : -1.0 Min. : 1.00
## 1st Qu.: 9404 Class :character 1st Qu.: 4.0 1st Qu.: 1.00
## Median : 22838 Mode :character Median : 15.0 Median : 2.00
## Mean : 50849 Mean : 131.8 Mean : 3.37
## 3rd Qu.: 87088 3rd Qu.: 71.0 3rd Qu.: 4.00
## Max. :194312 Max. :9999.0 Max. :3064.00
## is_spare img_url
## Length:1180987 Length:1180987
## Class :character Class :character
## Mode :character Mode :character
##
##
##
## [1] 4358 3
## inventory_id set_num quantity
## Min. : 35 Length:4358 Min. : 1.000
## 1st Qu.: 8076 Class :character 1st Qu.: 1.000
## Median : 16423 Mode :character Median : 1.000
## Mean : 52519 Mean : 1.813
## 3rd Qu.: 98685 3rd Qu.: 1.000
## Max. :191576 Max. :60.000
## [1] 13764 4
## fig_num name num_parts img_url
## Length:13764 Length:13764 Min. : 0.000 Length:13764
## Class :character Class :character 1st Qu.: 4.000 Class :character
## Mode :character Mode :character Median : 4.000 Mode :character
## Mean : 5.296
## 3rd Qu.: 5.000
## Max. :156.000
## [1] 66 2
## id name
## Min. : 1.00 Length:66
## 1st Qu.:19.25 Class :character
## Median :35.50 Mode :character
## Mean :35.36
## 3rd Qu.:51.75
## Max. :68.00
## [1] 29977 3
## rel_type child_part_num parent_part_num
## Length:29977 Length:29977 Length:29977
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## [1] 52615 4
## part_num name part_cat_id part_material
## Length:52615 Length:52615 Min. : 1.00 Length:52615
## Class :character Class :character 1st Qu.:17.00 Class :character
## Mode :character Mode :character Median :41.00 Mode :character
## Mean :38.91
## 3rd Qu.:60.00
## Max. :68.00
## [1] 21880 6
## set_num name year theme_id
## Length:21880 Length:21880 Min. :1949 Min. : 1
## Class :character Class :character 1st Qu.:2001 1st Qu.:273
## Mode :character Mode :character Median :2012 Median :497
## Mean :2008 Mean :442
## 3rd Qu.:2018 3rd Qu.:608
## Max. :2024 Max. :752
## num_parts img_url
## Min. : 0.0 Length:21880
## 1st Qu.: 3.0 Class :character
## Median : 31.0 Mode :character
## Mean : 161.4
## 3rd Qu.: 139.0
## Max. :11695.0
## [1] 323 3
## id name parent_id
## Min. : 3.0 Length:323 Min. : 1.0
## 1st Qu.:205.0 Class :character 1st Qu.:186.0
## Median :469.0 Mode :character Median :411.0
## Mean :419.9 Mean :360.6
## 3rd Qu.:632.5 3rd Qu.:512.5
## Max. :751.0 Max. :697.0
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
## Selecting by n
## `summarise()` has grouped output by 'name'. You can override using the
## `.groups` argument.
Conclusions:
The plot shows a clear upward trend in the average number of unique colors per Lego set from the 1950s to the present. The increase becomes noticeably steeper from the early 2000s onward.
This trend may reflect a strategy of making sets more varied and appealing, perhaps in response to market demand for more intricate and visually stimulating products.
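A minimal base R sketch of how such a per-year average could be computed. The three small data frames are made-up stand-ins that only borrow column names from the summaries above (`inventory_id`, `color_id`, `id`, `set_num`, `year`); the names `parts_sets`, `colors_per_set`, and `avg_colors_by_year` are hypothetical, and the original analysis may have been written differently (for example with dplyr).

```r
# Toy stand-ins for the sets, inventories, and inventory parts tables
sets <- data.frame(set_num = c("A-1", "B-1", "C-1"),
                   year    = c(1990, 1990, 2020))
inventories <- data.frame(id      = c(1, 2, 3),
                          set_num = c("A-1", "B-1", "C-1"))
inventory_parts <- data.frame(
  inventory_id = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  color_id     = c(0, 0, 4, 1, 15, 0, 1, 4, 15)
)

# Attach set_num and year to every part row
parts_sets <- merge(inventory_parts, inventories,
                    by.x = "inventory_id", by.y = "id")
parts_sets <- merge(parts_sets, sets, by = "set_num")

# Count distinct colors per set, then average those counts per year
colors_per_set <- aggregate(color_id ~ set_num + year, data = parts_sets,
                            FUN = function(x) length(unique(x)))
names(colors_per_set)[3] <- "n_colors"
avg_colors_by_year <- aggregate(n_colors ~ year, data = colors_per_set,
                                FUN = mean)
print(avg_colors_by_year)
```

Counting distinct colors per set first and averaging second keeps large sets from dominating the yearly figure.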
Conclusions:
The plot shows distinct spikes in certain years, which could indicate special editions or series of sets that included more minifigures, or a general increase in the inclusion of minifigures during those periods. Each spike is often followed by a drop, which may suggest a return to the norm.
After 2010 there appears to be a downward trend, suggesting that recent sets include fewer minifigures on average.
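The yearly average depends on whether sets without any minifigures are counted as zero or left out. The sketch below counts them as zero, which is an assumption; the source does not say which choice was made. All data are invented, the `fig-1` and `fig-2` identifiers are placeholders, and the sketch ignores the fact that a real set can have several inventory versions.

```r
# Toy stand-ins; only the column names follow the summaries above
sets <- data.frame(set_num = c("A-1", "B-1", "C-1", "D-1"),
                   year    = c(2009, 2009, 2015, 2015))
inventories <- data.frame(id      = 1:4,
                          set_num = c("A-1", "B-1", "C-1", "D-1"))
inventory_minifigs <- data.frame(inventory_id = c(1, 1, 3),
                                 fig_num      = c("fig-1", "fig-2", "fig-1"),
                                 quantity     = c(2, 1, 1))

# Total minifigures per inventory
figs_per_inv <- aggregate(quantity ~ inventory_id,
                          data = inventory_minifigs, FUN = sum)

# all.x = TRUE keeps inventories that contain no minifigures
set_figs <- merge(inventories, figs_per_inv,
                  by.x = "id", by.y = "inventory_id", all.x = TRUE)
set_figs$quantity[is.na(set_figs$quantity)] <- 0
set_figs <- merge(set_figs, sets, by = "set_num")

# Average minifigures per set for each year
avg_figs_by_year <- aggregate(quantity ~ year, data = set_figs, FUN = mean)
print(avg_figs_by_year)
```

Dropping the `all.x = TRUE` join would instead average over sets that contain at least one minifigure, which can change the shape of the trend.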
Conclusion:
The heatmap suggests that there is a positive correlation between the size of a Lego set (as measured by the number of parts) and the color diversity within the set (as measured by the number of unique colors). Sets that have a higher number of parts tend also to have a higher number of different colors.
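As a rough illustration of what sits behind a heatmap like this, the sketch below bins invented sets by part count and by number of unique colors, then tabulates the cells. The bin edges, the `set_stats` data frame, and the choice of a Spearman rank correlation as a numeric check are all assumptions for illustration, not the settings used for the plot.

```r
# Invented per-set statistics
set_stats <- data.frame(
  num_parts = c(5, 12, 40, 95, 150, 420, 900, 2500),
  n_colors  = c(2, 3, 5, 8, 9, 14, 20, 31)
)

# Bin both variables; intervals are closed on the right, e.g. (0, 50]
parts_bin  <- cut(set_stats$num_parts, breaks = c(0, 50, 500, Inf),
                  labels = c("1-50", "51-500", "501+"))
colors_bin <- cut(set_stats$n_colors, breaks = c(0, 5, 15, Inf),
                  labels = c("1-5", "6-15", "16+"))

# Cell counts: the numbers a heatmap would map to fill color
heat_counts <- table(parts_bin, colors_bin)
print(heat_counts)

# Rank correlation as a single-number summary of the association
rho <- cor(set_stats$num_parts, set_stats$n_colors, method = "spearman")
print(rho)
```

Counts concentrated along the diagonal of `heat_counts` correspond to the pattern described above: larger sets falling into higher color-count bins.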
The complexity of a set can be approximated by the number of unique part categories used in it.
## `geom_smooth()` using formula = 'y ~ x'
## [1] 0.5391932
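A minimal sketch of the complexity measure, assuming it is computed by joining each inventory's parts to their categories and counting distinct `part_cat_id` values. The data are invented (the category ids in particular are made up), `sets_inv` and `n_categories` are hypothetical names, and the resulting correlation has no relation to the value printed above.

```r
# Toy stand-ins; column names follow the summaries above
parts <- data.frame(part_num    = c("3001", "3002", "3020", "3626"),
                    part_cat_id = c(11, 11, 14, 13))
inventory_parts <- data.frame(
  inventory_id = c(1, 1, 2, 2, 2, 3, 3, 3, 3),
  part_num     = c("3001", "3002", "3001", "3020", "3020",
                   "3001", "3020", "3626", "3002")
)
# Hypothetical helper table: one row per inventory with its part count
sets_inv <- data.frame(inventory_id = 1:3,
                       num_parts    = c(10, 40, 200))

# Look up the category of every part in every inventory
ip <- merge(inventory_parts, parts, by = "part_num")

# Complexity = number of distinct part categories per inventory
complexity <- aggregate(part_cat_id ~ inventory_id, data = ip,
                        FUN = function(x) length(unique(x)))
names(complexity)[2] <- "n_categories"

df <- merge(sets_inv, complexity, by = "inventory_id")
r <- cor(df$num_parts, df$n_categories)  # Pearson correlation
print(df)
print(r)
```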
Conclusion:
The scatter plot reveals a moderate positive correlation (about 0.54) between the number of parts and set complexity. As the number of parts in a set increases, the number of unique part categories tends to increase as well, suggesting that larger sets are generally more complex. The relationship is most pronounced for sets with a smaller number of parts, as indicated by the dense cluster of points near the origin, where complexity rises quickly with part count.
For sets with a very high number of parts (toward the right end of the X-axis), the data points become more spread out, indicating more variability in complexity for these larger sets. It suggests that once a set reaches a certain size, the addition of more parts does not necessarily increase complexity at the same rate. This could be due to the use of repeated parts within these large sets or a design choice to not increase complexity despite a higher part count.
I will use the Prophet forecasting model to predict the number of sets released per year from 2024 onward; the forecast table below runs through 2033.
## Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.
## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.
## ds yhat yhat_lower yhat_upper
## 52 2024-01-01 403.7481 286.1416 526.2445
## 53 2025-01-01 412.1880 290.6553 547.6149
## 54 2026-01-01 419.8156 291.9226 539.2507
## 55 2027-01-01 429.3019 303.3481 559.2465
## 56 2028-01-01 440.6468 312.1446 569.9043
## 57 2029-01-01 449.0867 321.5536 571.4541
## 58 2030-01-01 456.7144 336.6848 582.3091
## 59 2031-01-01 466.2007 332.0993 595.2625
## 60 2032-01-01 477.5456 355.7787 603.0191
## 61 2033-01-01 485.9855 364.6903 623.4119
Conclusion:
The prediction suggests a continued increase in the number of Lego sets released each year, from a point estimate of about 404 in 2024 to about 457 in 2030.
If the past trend continues without significant change, we might expect the number of sets released annually to keep rising through 2030. However, the uncertainty intervals are wide (roughly 337 to 582 for 2030), and the forecast cannot account for unforeseen future events, so these predictions should be treated with caution.
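The forecasting step could be set up roughly as follows. This is a sketch under assumptions: the yearly counts are invented, `set_years` and `history` are hypothetical names, and `periods = 10` with `freq = "year"` is inferred from the ten yearly rows in the output above rather than taken from the original code. Prophet expects a data frame with a date column `ds` and a value column `y`; the model is fitted only if the prophet package is installed.

```r
# Invented release years: 11 sets in 2014 rising to 20 sets in 2023
set_years <- rep(2014:2023, times = 11:20)

# Count sets per year and reshape into the ds / y layout Prophet expects
counts <- as.data.frame(table(year = set_years), stringsAsFactors = FALSE)
history <- data.frame(
  ds = as.Date(paste0(counts$year, "-01-01")),
  y  = counts$Freq
)
print(history)

# Fit and forecast only when prophet is available
if (requireNamespace("prophet", quietly = TRUE)) {
  library(prophet)
  m <- prophet(history)
  future <- make_future_dataframe(m, periods = 10, freq = "year")
  forecast <- predict(m, future)
  print(tail(forecast[, c("ds", "yhat", "yhat_lower", "yhat_upper")], 10))
}
```

With one observation per year, Prophet has no weekly or daily pattern to learn, which matches the messages about disabled weekly and daily seasonality in the output above.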